logo

SEA Shark and Ray Research and Conservation Workshop


Introductions

Who am I?


Vinay is a Research Scientist at the Australian Institute of Marine Science. He is an ecologist that is particularly interested in using spatio-temporal datasets to understand animal movements and distributions patterns. He has considerable experience using R to analyse and visualise large and complex spatial datasets. He has developed R code and packages to analyse 2 and 3 dimensional movement patterns of animals using acoustic telemetry data from single study sites to continental scale arrays. Vinay’s R codes can be found on his github page.



Course outline

In this course you will learn about different ways to analyse and interpret your aquatic telemetry datasets using R. This workshop will demonstrate how R can make the processing of spatial data much quicker and easier than using standard GIS software! At the end of this workshop you will also have the annotated R code that you can re-run at any time, share with collaborators and build on with those newly acquired data!

I designed this course not to comprehensively cover all the tools in R, but rather to give you an understanding of options on how to analyse your telemetry data (both from satellite and acoustic tags). Every new project comes with its own problems and questions and you will need to be independent, patient and creative to solve these challenges. It makes sense to invest time in becoming familiar with R, because today R is the leading platform for environmental data analysis and has some other functionalities which may surprise you!


This R workshop is intended to run across 3 sessions.


  • Session 1: Getting familiar with R and spatial data
  1. Import and explore datasets tidyverse
  2. Working with Spatial objects using sf, ggspatial and mapview


  • Session 2: Working with satellite telemetry data
  1. Understanding the data structure from satellite tags
  2. Processing satellite tag data using the aniMotum package
  3. Visualising satellite tag data using the ggspatial package


  • Session 3: Working with passive acoustic telemetry data
  1. Understanding the data structure from acoustic telemetry data
  2. Using the VTrack R package to explore patterns in animal detections and dispersal
  3. Using the remora R package to interactively explore your telemetry data



Course Resources

The course resources will be emailed to you prior to the workshop. However, you can also access the data and scripts we will work through in this course following these steps:

1. Download the course materials from this link.

2. Unzip the downloaded file. It should have the following two folders and a html file:

  • Code folder
  • Data folder
  • SEA-workshop2023.html file

3. Save all the files in a location on your computer you can find again during the workshop



Back to top



Software installation


Processing and analysing large datasets like those from animal telemetry work can require a huge investment in time: rearranging data, removing erroneous values, purchasing, downloading and learning the new software, and running analyses. Furthermore merging together Excel spreadsheets, filtering data and preparing data for statistical analyses and plotting in different software packages can introduce all sorts of errors.

R is a powerful language for data wrangling and analysis because…

  • It is relatively fast to run and process commands
  • You can create repeatable scripts
  • You can trace errors back to their source
  • You can share your scripts with other people
  • It is easy to identify errors in large data sets
  • Having your data in R opens up a huge array of cutting edge analysis tools.
  • R is also totally FREE!


Installing packages

Part of the reason R has become so popular is the vast array of packages that are freely available and highly accessible. In the last few years, the number of packages has grown exponentially > 10,000 on CRAN! These can help you to do a galaxy of different things in R, including running complex analyses, drawing beautiful figures, running R as a GIS, constructing your own R packages, building web pages and even writing R course handbooks like this one!

Let’s suppose you want to load the sf package to access this package’s incredible spatial functionality. If this package is not already installed on your machine, you can download it from the web by using the following command in R.

install.packages("sf", repos='http://cran.us.r-project.org')

In this example, sf is the package to be downloaded and ‘http://cran.us.r-project.org’ is the repository where the package will be accessed from.


More recently, package developers have also used other platforms like GitHub to house R packages. This has enabled users to access packages that are actively being updated and enable developers to fix problems and develop new features with user feedback.

The remotes and devtools R packages have enabled the installation of packages directly from platforms like GitHub. For example, if we want want to download the VTrack package from the github repository, we can use the install_github() package to do it like this:

remotes::install_github("rossdwyer/VTrack")

Once installing the packages we want to use, we need to load them using the library() function.

library(sf)
library(VTrack)

Installation instructions:

For this course, make sure you have downloaded and installed the most updated versions of the following software:


1. Download R for your relevant operating system from the CRAN website

2. Download RStudio for your relevant operating system from the RStudio website

3. Once you’ve installed the above software, make sure you install the following packages prior to the start of the course

## Packages that are on CRAN

install.packages(c("tidyverse",
                   "sf",
                   "ggspatial",
                   "leaflet",
                   "remotes"))

## Install packages from GitHub and other external repositories

remotes::install_github("r-spatial/mapview", build_vignettes = TRUE)
remotes::install_github("rossdwyer/VTrack", build_vignettes = TRUE)
remotes::install_github('IMOS-AnimalTracking/remora', build_vignettes = TRUE)
install.packages("aniMotum", 
                 repos = c("https://cloud.r-project.org",
                 "https://ianjonsen.r-universe.dev"),
                 dependencies = TRUE)

When downloading from GitHub, the R console will ask you if it should also update packages. In most cases, you can proceed with updating packages only found on CRAN (option [2]).



Back to top



Session 1

Getting familiar with R and spatial data


The process of turning raw data into publishable results is a highly involved, particularly with telemetry studies. Tracking data sets are becoming larger, and larger as they are being gathered over longer time periods, over larger spatial extents and at increasing temporal resolutions. While this is increasing our ability to detect subtle patterns, these data sets are becoming vast and require analytical tools that easily handle, manipulate and visualise these complex datasets.

In this session we are going to work with a data set containing detection data from 3 Australian Blacktip Sharks (Carcharhinus tilstoni) shown in the image above. These animals were captured and tagged in Townsville, Australia roughly one month prior to the landfall of Cyclone Yasi in 2011. Blacktip sharks were tracked using a network of acoustic hydrophones deployed in a grid pattern on the East and West side of Cleveland Bay.

Telemetry data from these sharks were analysed alongside 45 others from five species to examine movement patterns of coastal sharks before, during and after three extreme weather events in Australia (Cyclone Yasi and Tropical Storm Anthony, 2011) and the US (Tropical Storm Gabrielle, 2001). You can read more about that study here.


You can download the datasets we will use for this session from here. The link should download a zip file with two .csv files that we will use during this session.


The web map of detection data we will explore by the end of Session 1.

Click on the tabs below to progress through the first session. Lets start with exploring datasets using tidyverse:


Import and explore datasets using the tidyverse


Before we can analyse these data, we first need to read this dataset into R. As with most acoustic detection datasets exported from VUE or other acoustic telemetry data management software, our data set is in the ‘comma sperated value’ (.csv) format.

A .csv file can simply be imported into R using the read.csv base function, and by telling R which file to load (Blacktip_ClevelandBay.csv) and where to find it (i.e. in the ‘Data’ folder).

# Load the blacktip shark data using base read.csv function
blacktip <- read.csv('location_of_csv_file', header = TRUE)


A note about Excel files

Don’t use ‘.xlsx’ or ‘.xls’ files for saving data. The problem with ‘.xls’ and ‘.xlsx’ files are that they store extra info with the data that makes files larger than necessary and Excel formats can also unwittingly reformat or alter your data!

A stable way to save your data is as a ‘.csv’ file. These are simply values separated by ‘commas’ and rows defined by ‘returns’. If you select ‘Save as’ in Excel, you can choose ‘.csv’ as one of the options. If you open the .csv file provided in the ‘Data’ folder using a text editor, you will see it is just words, numbers and commas.




What is the tidyverse?

The tidyverse is the collective name given to suite of R packages designed mostly by Hadley Wickham. This is becoming an increasingly popular set of packages that share an underlying design philosophy, grammar, and data structure. You can learn more about all the features of these packages from the free online course developed by the package creators here.


Members of the tidyverse

readr, broom, dplyr, forcats, ggplot2, haven, httr, hms, jsonlite, lubridate, magrittr, modelr, purrr, readr, readxl, stringr, tibble, rvest, tidyr, xml2

The advantage of the tidyverse is that most of these packages (but not all!) can be loaded simultaneously using a single line of code

library(tidyverse)


The tidyverse version of the above code will be read_csv() function. The main difference being the data imported as a tibble data frame. The advantage of a tibble database is that all the columns will be formatted correctly, with the package guessing what the best format may be.

blacktip <- read_csv('data/Session 1/Blacktip_ClevelandBay.csv')

# You can also use read_csv to input data directly from a website URL
blacktip <- 
  read_csv('https://raw.githubusercontent.com/vinayudyawer/SEA-workshop2023/main/data/Session%201/Blacktip_ClevelandBay.csv')

head(blacktip)
## # A tibble: 6 × 10
##   date_time           receiver   transmitter transmitter_name transmitter_serial
##   <dttm>              <chr>      <chr>       <chr>                         <dbl>
## 1 2011-01-23 16:41:31 VR2-5052   A69-1303-6… Ana                        11237964
## 2 2011-01-23 16:43:35 VR2-5052   A69-1303-6… Ana                        11237964
## 3 2011-01-23 16:48:25 VR2-5052   A69-1303-6… Ana                        11237964
## 4 2011-01-23 21:07:34 VR2W-1049… A69-1303-6… Ana                        11237964
## 5 2011-01-23 21:12:06 VR2W-1049… A69-1303-6… Ana                        11237964
## 6 2011-01-24 10:57:27 VR2W-1041… A69-1303-6… Ana                        11237964
## # ℹ 5 more variables: sensor_value <lgl>, sensor_unit <lgl>,
## #   station_name <chr>, latitude <dbl>, longitude <dbl>


Pipes %>%


Now that we’ve successfully loaded in our tracking dataset, lets start having a closer look at the data using pipes %>%

  • Originally from the magrittr package but has been imported to the tidyverse.
  • %>% is an infix operator. This means it takes two operands, left and right.
  • ‘Pipes’ the output of the last expression/function (left) forward to the first input of the next funciton (right).
# For example, to see what class our data is in, we could use this code...
class(blacktip)

# Alternatively in the tidyverse we could use this code...
blacktip %>% class()


Benefits of pipes %>%

  • Functions flow in natural order that tells story about data.
  • Code effects are easy to reason about by inserting View() or head() into pipe chain.
  • Common style makes it easy to understand collaborator (or your own) code.

We can have a quick look at the data by typing:

# Now insert functions into the pipe chain
blacktip %>% View()
blacktip %>% head() # first 6 rows by default
blacktip %>% tail(10) # specify we want to look at the last 10 rows

This functionality is particularly useful if the data is very large!

Note the (), as opposed to the [] we used for indexing. The () signify a function.

We can look at the data more closely using the nrow(), ncol(), length(), unique(), str() and summary() functions.

blacktip %>% nrow() # number of rows in the data frame
blacktip %>% ncol() # number of columns in the data frame
blacktip %>% str() # provides internal structure of an R object
blacktip %>% summary() # provides result summary of the data frame
# pipes can be used for single column within data frames
blacktip$transmitter_name <-
  blacktip$transmitter_name %>% as.factor()

# pipes are used to conduct multiple functions on the dataset in a certain order
blacktip %>% 
  subset(transmitter_name == "Colin") %>% # subset dataset to include only detections by 'Colin'
  nrow() # number of rows (i.e. detections) from 'Colin'

Pipes can also be used to pre-process our data before plotting them. Lets now use pipes to plot a simple barplot of the number of Colins detections at each reciever.

blacktip %>% 
  subset(transmitter_name == "Colin") %>% # subset dataset to include only detections by 'Colin'
  with(table(station_name)) %>% # create a table with the number of rows (i.e. detections) per receiver
  barplot(las = 2, xlab = "Receiver station", ylab = "Number of Detections") # barplot of number of Colin's detections recorded per receiver


dplyr

  • dplyr is the data wrangling workhorse of the tidyverse.
  • Provides functions, verbs, that can manipulate data into the shape you need for analysis.
  • Has many backends allowing dplyr code to work on data stored in SQL databases and big data clusters.
  • Works via translation to SQL. Keep an eye out for the SQL flavour in dplyr

Basic vocabulary

  • select() columns from a tibble
  • filter() to rows matching a certain condition
  • arrange() rows in order
  • mutate() a tibble by changing or adding rows
  • group_by() a variable
  • summarise() data over a group using a function

Check out this useful online cheatsheet for data wrangling.


select

We can use the select function in dplyr to choose the columns we want to include for our analyses and plotting

# Select the rows we are interested in
blacktip <- 
  blacktip %>% 
  select(date_time, latitude, longitude, receiver, station_name, transmitter_name, transmitter, sensor_value) %>% # columns we want to include
  select(-sensor_value) # the minus symbol denotes columns we want to drop

head(blacktip)
## # A tibble: 6 × 7
##   date_time           latitude longitude receiver  station_name transmitter_name
##   <dttm>                 <dbl>     <dbl> <chr>     <chr>        <fct>           
## 1 2011-01-23 16:41:31    -19.3      147. VR2-5052  E18          Ana             
## 2 2011-01-23 16:43:35    -19.3      147. VR2-5052  E18          Ana             
## 3 2011-01-23 16:48:25    -19.3      147. VR2-5052  E18          Ana             
## 4 2011-01-23 21:07:34    -19.3      147. VR2W-104… E2           Ana             
## 5 2011-01-23 21:12:06    -19.3      147. VR2W-104… E2           Ana             
## 6 2011-01-24 10:57:27    -19.3      147. VR2W-104… E1           Ana             
## # ℹ 1 more variable: transmitter <chr>


filter and arrange

We can use these functions to subset the data to rows matching logical conditions and then arrange according to particular attributes

blacktip %>%
  filter(transmitter_name == "Ana") %>%
  arrange(date_time) # arrange Ana's detections in chronological order

blacktip %>%
  filter(transmitter_name == "Bruce") %>%
  arrange(desc(date_time)) # arrange Bruce's detections in descending chronological order


group_by and summarise

Determine the total number of detections for each tagged shark

blacktip %>%
  group_by(transmitter_name) %>%
  summarise(NumDetections = n()) # summarise number of detections per tagged shark

blacktip %>%
  group_by(transmitter_name, station_name) %>%
  summarise(NumDetections = n()) # summarise number of detections per shark at each receiver


mutate

Adding and removing data to the data frame through a pipe

blacktip <- 
  blacktip %>%
  mutate(date = as.Date(date_time)) %>% # adding a column to the blacktip data with date of each detection
  mutate(transmitter = NULL) # removing the `Transmitter` column

head(blacktip)
## # A tibble: 6 × 7
##   date_time           latitude longitude receiver  station_name transmitter_name
##   <dttm>                 <dbl>     <dbl> <chr>     <chr>        <fct>           
## 1 2011-01-23 16:41:31    -19.3      147. VR2-5052  E18          Ana             
## 2 2011-01-23 16:43:35    -19.3      147. VR2-5052  E18          Ana             
## 3 2011-01-23 16:48:25    -19.3      147. VR2-5052  E18          Ana             
## 4 2011-01-23 21:07:34    -19.3      147. VR2W-104… E2           Ana             
## 5 2011-01-23 21:12:06    -19.3      147. VR2W-104… E2           Ana             
## 6 2011-01-24 10:57:27    -19.3      147. VR2W-104… E1           Ana             
## # ℹ 1 more variable: date <date>


lubridate

  • lubridate is an easy way to convert date and time data into a form that R can recognise
  • Allows for calculation of durations and intervals between dates.
  • Recognises multiple date time formats and parses them to a standardised ‘POSIX’ format that R uses (ymd for dates; ymd_hms for date and time parsing)
  • These features are very important when working with spatio-temporal datasets like telemetry data

Currently in our blacktip dataset the “date_time” column is in the Universal Coordinated Time Zone (UTC). Let’s use lubridate to convert this column into the ‘POSIX’ format and into the local date time (i.e. UTC + 10 hours).

library(lubridate)

blacktip <-
  blacktip %>% 
  mutate(local_date_time = with_tz(date_time, tzone = "Australia/Brisbane")) %>% # convert to local "Australia/Brisbane" date time (UTC + 10hrs)
  mutate(date = date(local_date_time)) # use lubridate to update local date time into a date field


Data visualisation using ggplot2

ggplot2 is a powerful data visualization package for the R programming language. The package makes it very easy to generate some very impressive figures and utilise a range of colour palettes, taking care of many of the fiddly details that can make plotting graphs in R a hassle.

The system provides mappings from your data to aesthetics which are used to construct beautiful plots.

Documentation for ggplot2 can be found here.

There is also this awesome cheetsheet for ggplot2


ggplot2 grammar

The basic idea: independently specify plot building blocks and combine them to create just about any kind of graphical display you want.

Building blocks of a graph include:

  • data
  • aesthetic mapping
  • geometric object
  • statistical transformations
  • scales
  • coordinate system
  • position adjustments
  • faceting

Aesthetic Mapping

In ggplot2, aesthetic means “something you can see”. Aesthetic mapping (i.e., with aes()) only says that a variable should be mapped to an aesthetic. It doesn’t say how that should happen. For example, when mapping a variable to shape with aes(shape = x) you don’t say what shapes should be used. Similarly, aes(color = z) doesn’t say what colors should be used. Describing what colors/shapes/sizes etc. to use is done by modifying the corresponding scale.

In ggplot2 scales include:

  • position (i.e., on the x and y axes)
  • color (“outside” color)
  • fill (“inside” color)
  • shape (of points)
  • linetype
  • size

Each type of geom accepts only a subset of all aesthetics–refer to the geom help pages to see what mappings each geom accepts. Aesthetic mappings are set with the aes() function.


Geometic Objects (geom)

Geometric objects are the actual marks we put on a plot. Examples include:

  • points (geom_point, for scatter plots, dot plots, etc)
  • lines (geom_line, for time series, trend lines, etc)
  • boxplot (geom_boxplot, for, well, boxplots!) A plot must have at least one geom; there is no upper limit. You can add a geom to a plot using the + operator

You can get a list of available geometric objects using the code below:

help.search("geom_", package = "ggplot2")


In the below script we call the data set we have just made (blacktip) and then pipe it into the ggplot() function. We than tell ggplot that we want to plot a box plot.

library(ggplot2)   

blacktip %>%
  group_by(transmitter_name, date) %>% 
  summarise(daily_detections = n()) %>% # use summarise to calculate numbers of detections per day per animal
  ggplot(mapping = aes(x = transmitter_name, y = daily_detections)) + # define the aesthetic map (what to plot)
  xlab("Tag") + ylab("Number of detections per day") +
  geom_boxplot() # define the geometric object (how to plot it).. in this case a boxplot
## `summarise()` has grouped output by 'transmitter_name'. You can override using
## the `.groups` argument.

A common plot used in passive acoustic telemetry to assess temporal patterns in detection is the ‘abacus plot’. This plot can help quickly assess which animals are being detected consistently within your array, and identify any temporal or spatial patterns in detection frequency.

We can adapt the above script to create an abacus plot using our blacktip dataset.

blacktip %>%
  ggplot(mapping = aes(x = local_date_time, y = transmitter_name)) + 
  xlab("Date") + ylab("Tag") +
  geom_point()

Additional Task: Now that you’ve plotted the raw dates, can you figure out how to plot daily detections of our tagged sharks

We can also use the facet_wrap() function to explore the detection data further and look at how animals were detected at each reciever.

blacktip %>%
  ggplot(mapping = aes(x = local_date_time, y = station_name)) + 
  xlab("Date") + ylab("Receiver station") +
  geom_point() +
  facet_wrap(~transmitter_name, nrow=1) # This time plot seperate boxplots for each shark

Additional Task: Can you now plot this with a different colour for each shark?



Back to top



Working with Spatial objects using sf, ggspatial and mapview


R offers a variety of functions for importing, manipulating, analysing and exporting spatial data. Although one might at first consider this to be the exclusive domain of GIS software, using R can frequently provide a much more lightweight, yet equally effective solution that embeds within a larger analytic workflow.

One of the tricky aspects of pulling spatial data into your analytic workflow is that there are numerous complicated data formats. In fact, even within R itself, functions from different user-contributed packages often require the data to be structured in very different ways. The good news is that just like the tidyverse package family, efforts are underway to standardize spatial data classes in R.

This movement is facilitated by sf, an important base package for spatial operations in R. It provides definitions for basic spatial classes (points, lines, polygons, pixels, and grids) in an attempt to unify the way R packages represent and manage these sorts of data, and uses grammer that can be integrated into tidyverse script. It also includes some core functions for creating and manipulating these data structures. The hope is that all spatial R packages will use (or at least provide conversions to) the ‘Spatial’ data class and its derivatives, as now defined in the sf package.

Here is a very useful style guide for coding using the sf package.


Coordinate Reference Systems (CRS)

Central to working with spatial data, is that these data have a coordinate reference system (CRS) associated with it. Geographical CRS are expressed in degrees and associated with an ellipse, a prime meridian and a datum. Projected CRS are expressed in a measure of length (meters) and a chosen position on the earth, as well as the underlying ellipse, prime meridian and datum.

Most countries have multiple coordinate reference systems, and where they meet there is usually a big mess — this led to the collection by the European Petroleum Survey Group (EPSG) of a geodetic parameter dataset.

The EPSG list among other sources is used in the workhorse PROJ.4 library, and handles transformation of spatial positions between different CRS. This library is interfaced with R in the rgdal package, and the CRS is defined partly in the proj package and partly in rgeos.

In the next step, we will convert our blacktip dataset (blacktip) into a spatial object and specify the CRS. We therefore need to refer to the correct CRS information associated with the spatial data.

For simplicity, each projection can be referred to by a unique ID from the European Petroleum Survey Group (EPSG) geodetic parameter dataset. You can find the relevant EPSG code for your coordinate system from this website. There, simply enter in a key word in the search box and select from the list the correct coordinate system. There is a map image in the top right of the site to help you.

The equivalent EPSG code for WGS 84 is 4326


The sf package

The sf package makes it easy to convert any data frame into a spatial object which we can then plot and explore using other packages. The conversion from data frame to a spatial object can be done easily using the st_as_sf() function. Lets convert two data sets into spatial objects so we can plot them out. The main information we need to provide to convert it to a spatial object will be the names of the columns that denote the coordinates, and the CRS for the data. Our receiver coordinates, and hence detection coordinates, were recorded in the WGS 84 geographic datum in decimal degrees, which is the equivalent EPSG code of 4326.

library(sf)

# Import datasets
blacktip <- read_csv('Data/Session 1/Blacktip_ClevelandBay.csv')
statinfo <- read_csv('Data/Session 1/Station_information.csv')

cb_stations <- 
  statinfo %>% 
  filter(installation %in% "Cleveland Bay") %>% 
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326, remove = F)

blacktip_sf <-
  blacktip %>% 
  group_by(transmitter_name, station_name, longitude, latitude) %>% 
  summarise(num_det = n()) %>% 
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326, remove = F)


Now if you look at the objects we have created, it should give you some information on what kind of spatial object it has created (POINTS), and it will provide information on what CRS has been assigned to the coordinate data set. The data itself should look very similar to a tibble object which should be familiar to you now. The addition of the geometry column should give you a hint that it is now a spatial point object.

head(blacktip_sf)
## Simple feature collection with 6 features and 5 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 146.8771 ymin: -19.29675 xmax: 146.9174 ymax: -19.27527
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 6
## # Groups:   transmitter_name, station_name, longitude [6]
##   transmitter_name station_name longitude latitude num_det
##   <chr>            <chr>            <dbl>    <dbl>   <int>
## 1 Ana              E1                147.    -19.3     927
## 2 Ana              E18               147.    -19.3      54
## 3 Ana              E19               147.    -19.3     165
## 4 Ana              E2                147.    -19.3      91
## 5 Ana              E20               147.    -19.3      76
## 6 Ana              E21               147.    -19.3      42
## # ℹ 1 more variable: geometry <POINT [°]>


Plotting a spatial object using ggplot2

Now that we have a spatial object, we can simply visualise it using the ggplot2 package, and using the geom_sf aesthetic.

ggplot(cb_stations) +
  geom_sf()



The ggspatial package

We can also now use other packages to help make very nice, publication ready maps. The ggspatial package is a very useful package that allows to integrate basemaps to your plots. Adding basemaps means you can do more complex mapping without the need to import multiple shapefiles. These basemaps also include satellite imagery which means you can produce nice maps with a variety of basemaps in the background. The grammar used by ggspatial is similar to that of ggplot2, so if you are familiar with that plotting package, the way you build a map is similar.

There are a range of basemaps available to users through various providers like ESRI, Carto, OpenStreetMap, Mapbox and even Google. However, keep in mind some of these providers (like Google and Mapbox) require users to register and purchase a map token to allow access. Lets try to plot our receiver stations with a basic basemap using the OpenStreetMap provider. We do this by using the annotation_map_tile() function, followed by adding a spatial layer using layer_spatial().

library(ggspatial)

ggplot() +
  annotation_map_tile(type = "osm", zoom = 12) +
  layer_spatial(data = cb_stations)

You will notice, you now have an extra folder in your working directory called rosm.cache. This is the folder ggspatial uses to cache basemap images for your locations and zoom level. This reduces the number of times we have to download the basemap images if we are plotting the same map multiple times. Now lets explore other basemap options that may be useful for making professional looking plots:

## Lets explore other basemaps

ggplot() +
  annotation_map_tile(type = "cartolight", zoom = 12) +
  layer_spatial(data = cb_stations)

## lets try non-standard basemaps (ESRI satellite imagery)

esri_sat <- paste0('https://services.arcgisonline.com/arcgis/rest/services/',
                   'World_Imagery/MapServer/tile/${z}/${y}/${x}.jpeg')

ggplot() +
  annotation_map_tile(type = esri_sat, zoom = 12) +
  layer_spatial(data = cb_stations)

## Other non-standard basemaps (ESRI topographical imagery)

esri_topo <- paste0('https://services.arcgisonline.com/arcgis/rest/services/',
                   'World_Topo_Map/MapServer/tile/${z}/${y}/${x}.jpeg')

ggplot() +
  annotation_map_tile(type = esri_topo, zoom = 12) +
  layer_spatial(data = cb_stations)

Just like in gglot2 you can now add more layers to the map, and control the aesthetics of the layers in the same way. Lets add the detection data to this map. We can also add other features in the map like a scale bar using the annotation_scale() function. Lets plot a bubble plot showing numbers of detections for each individual tracked:

ggplot() +
  annotation_map_tile(type = esri_topo, zoom = 12) +
  layer_spatial(data = blacktip_sf, aes(size = num_det, col = transmitter_name)) +
  layer_spatial(data = cb_stations, pch = 21) +
  facet_wrap(~transmitter_name, nrow = 1) +
  annotation_scale() +
  theme(legend.position = "bottom")



The mapview package

We can also use the mapview package to plot our spatial objects as interactive maps. The mapview package provides functions to very quickly create interactive visualisations of spatial data. Mapping can be very easy, and like with ggspatial you can add different layers to build more complex interactive maps.

library(mapview)

mapview(cb_stations)


Now lets add the detection data onto the map. We can use the burst parameter to enable subsetting and seperately plot different subsets interactively. We can use other functions from the leaflet package to modify our interactive map. Just like ggplot2, we can layer multiple mapview objects by just adding them in sequence. Explore all the settings in the mapview() function to customise the difference aspects of your map.

m <-
  mapview(cb_stations, 
        alpha.reg = 0, 
        alpha = 1, 
        color = "grey", 
        map.types = "Esri.WorldImagery", 
        legend = F, homebutton = F, 
        cex = 5) +
  mapview(blacktip_sf, 
          zcol = "transmitter_name", 
          alpha = 0, 
          alpha.reg = 1,
          burst = T, 
          legend = F, homebutton = F, 
          cex = 5)

m

If you have a closer look at our interactive map (m), you’ll notice it has two components:

m@object

m@map

The m@object component houses all the raw spatial data that the interactive map plots, and the m@map component has the actual map. We can use other functions from the leaflet package to make other adjustments to our @map component to make is easier to interact with. Here we will use the addLayersControl() function to modify how we can interact with the subsets of detection data:

library(leaflet)

mm <-
  m@map %>% 
    addLayersControl(
        baseGroups = unique(blacktip_sf$transmitter_name),
        options = layersControlOptions(collapsed = FALSE)) %>%
    hideGroup(unique(blacktip_sf$transmitter_name))

mm


We can then use the mapshot() function in the mapview package to save our interactive map as a html file or a png output. You can then share the html version of the output to collaborators, or upload them on websites for others to explore your data interactively.

mapview::mapshot(mm, url = "Blacktip_interactive_map.html", remove_controls = NULL, selfcontained = TRUE)



Back to top



Session 2

Working with satellite telemetry data


Structure of satellite tag data


Satellite tracking of sharks has revolutionized our understanding of these marine predators. By harnessing advanced satellite technology and specialised tagging devices, researchers can monitor the movements, behaviors, and migratory patterns of various shark species with unprecedented precision. This invaluable tool not only sheds light on the ecological roles of sharks but also contributes to their conservation by identifying critical habitats, migration routes, and overlap with potential threats. Satellite tracking plays a pivotal role in bridging the gap between scientific knowledge and effective shark conservation efforts in our oceans. Not all satellite tracking devices work the same way. In marine research, there are three main types of tags used to track sharks and rays, and each come with their own data structure:


ARGOS tags


ARGOS (Advanced Research and Global Observation Satellite) tags are essential tools for tracking sharks in the vast expanse of the world’s oceans. These tags estimate a shark’s position by receiving signals from a network of ARGOS satellites orbiting the Earth. Each ARGOS tag is equipped with a transmitter that periodically sends out signals containing the tag’s unique identification number and other data. When the tag’s signals reach an ARGOS satellite passing overhead, the satellite records the time and location of the signal reception. By using the Doppler effect of frequency of the transmitted signal vs the received signal (from multiple satellites), the ARGOS system calculates the tag’s position with a certain degree of accuracy. This position estimate is then relayed to researchers on the ground, providing valuable data about the shark’s whereabouts, including latitude and longitude coordinates. While ARGOS tags offer global coverage, they may have limitations in terms of location accuracy compared to GPS tags and rely on animals coming up to the surface regularly, making them more suitable for tracking sharks with broader ranges and less need for high precision. Often ARGOS tags are also combined with GPS transmitters to supplement the positional data with higher accuracy information.

Raw ARGOS data are often pre-processed using a number of algorithms like Kalman filters and least squares methods to refine and define the accuracy of positional data obtained from ARGOS tags. These advanced mathematical techniques allow us to correct and improve the accuracy of the initial estimates provided by the ARGOS system. Kalman filters, for instance, model the movement of the tagged animal over time, incorporating factors like ocean currents and atmospheric conditions to produce more reliable position estimates. Least squares methods minimize the discrepancy between observed ARGOS positions and expected positions based on the animals known characteristics. Once filtered, each estimated position is associated with a specific ‘Location Class’ that are either 3, 2, 1, 0, A, B, Z (and G if GPS data is included).


Data structure:

The structure for data obtained using ARGOS devices after running it through a post-processing filter includes: tag ID (often PTT), date and time of detection, estimated latitude, estimated longitude, location class of estimated error; if the data has been processed using a Kalman filter, the error is defined by additinal variables including error radius, semi-major axis, semi-minor axis, ellipse orientation. Tags used in marine tracking also collect additional environmental data including depth and water temperature, and can be set to collect other biological data including (but not limited to) acceleration.



Back to top



GPS tags


GPS (Global Position System) tags are invaluable for accurately determining the positions of tracked sharks and rays. These tags rely on a constellation of GPS satellites orbiting the Earth. Each GPS tag is equipped with a GPS receiver that constantly listens for signals from multiple GPS satellites. By measuring the time it takes for signals to reach the tag from different satellites, the tag can calculate its precise latitude and longitude coordinates. GPS tags provide real-time, high-resolution location data, making them ideal for tracking sharks with fine-scale movements or those inhabiting nearshore environments. This technology enables researchers to monitor a shark’s every move, from its foraging behavior to its migration patterns, with a level of detail that is unmatched by other tracking methods. GPS tags are particularly valuable for understanding the spatial ecology of coastal and territorial sharks. Again, like the ARGOS tags these tags also rely on animals coming up to the surface regularly, and are more suitable for tracking non-benthic or pelagic species. Like with ARGOS tags, GPS tags are often also paired with other technologies (e.g., GSM mobile phone technology) to increase spatial coverage and increase battery life of tags.


Data structure:

The structure for data obtained using GPS devices is a lot more simpler than other formats and often includes at the least: tag ID (often PTT), date and time of detection, estimated latitude, estimated longitude. Like with ARGOS tags, GPS tags for marine research also often collect a range of environmental data including depth and water temperature, and can be set to collect other biological data including (but not limited to) acceleration.



Back to top



GLS tags


GLS (Geolocation sensor) tags are a specialized tracking tags that estimates a shark’s position based on the timing of natural light patterns, specifically sunrise and sunset. These tags are often used for tracking oceanic and pelagic sharks that roam over vast distances, but are also useful for benthic species that don’t surface frequently. A GLS tag is equipped with light sensors that record the daily variations in ambient light levels. By analyzing the timing and duration of these light patterns, the tag can estimate the shark’s latitude and longitude. The estimation of coordinates from GLS tags are done using one of two methods;

  1. Threshold methods: use two successive twilight events (the times at which the recorded light level crosses a predetermined threshold) to produce a location estimate.
  2. Curve-fitting methods: use rate of change in light level during the twilight period to derive a location estimate using a single twilight.


In GLS tags used in marine environments, the light data is also supplimented by the tag collecting light attenuation with depth to provide more accurate estimates of twighlight. Light geolocator methods have large errors in location estimates. GLS tags do not provide real-time location data and are generally less accurate than GPS tags. Nevertheless, they are suitable for more benthic species or sharks and rays that may not tolerate larger, more intrusive tags, and they have contributed significantly to our understanding of long-distance migrations and seasonal behaviors. Like the ARGOS and GPS tags, GLS tags in shark research are often packaged with other sensors (e.g., temperature, pressure, accelerometers) to transmit more information on the animal’s environment, and are often deployed on timed releases to detach from the tagged animal before transmitting information back to the researchers.







Data structure:

GLS tag data need to be processed to estimate most likely coordinates based on the light data. In most cases, tag manufacturers provide their own post-processing software (e.g., Wildlife Computers GPE3 algorithm). These software often use state-space or maximum likelihood algorithms to improve the estimation of locations based on other data collected by the tag (i.e., water temperature, depth, deployment and pop-off location, ARGOS/GPS based positional data). The software uses the combination of light, maximum depth, water temperature, and any other positional data provided to compare with known bathymetry and seasonal temperature maps in the region of the study site. This process helps improve estimated coordinates, and provides a measure or error for each estimated location.

The structure for data obtained using GLS devices after running it through a post-processing software includes: tag ID (often PTT), date and time of estimated location, estimated latitude, estimated longitude, error in latitudinal estimate and longitudinal estimate. The tags used in marine tracking will also collect and transmit environmental data including depth and water temperature, and can be set to collect other biological data including (but not limited to) acceleration.



Back to top



Processing satellite tag data using aniMotum


In this session we will work with a small subset of tracking data collected by Brad Norman and his team at ECOCEAN. The data consists of tracking data from two Whale Sharks (Rhincodon typus) tagged with ARGOS tags at the Thaa Atoll, the Maldives, in 2017. The data shows the large oceanic dispersals these species conduct across the Indian Ocean. The data have been processed through a Kalman filter, and provides coordinates of both shark movements, and have some estimates of error for each location. We will use this dataset to further refine positions, and predict positions at regular intervals, and calculate measures of movement to get some insight into the behaviours of the two individuals.


You can download the dataset we will use for this session from here. The link should download a zip file with one .csv file that we will use during this session.



We will use the aniMotum package developed by Dr. Ian Jonsen. This package provides a quick and easy means to refine positions by integrating the error associated with location data, and modelling the data to be able to predict locations at fixed intervals. The package also provides a means to explore potential movement behaviours using the characteristics of the track. For more details on the package you can explore the vignettes provided with the package, and also explore its applications in this paper.


The package has a clear workflow, and we can walk through the first 4 steps to process and visualise our data:


  1. Modify our data to the expected input format

  2. Choose and fit an appropriate State-space movement process model

  3. Check our model fit

  4. Visualise our model estimates


Lets begin our data processing by first reading in the data, and formatting it so that the package can read it properly:

library(tidyverse)
library(aniMotum)

raw_data <- read_csv('https://raw.githubusercontent.com/vinayudyawer/SEA-workshop2023/main/data/Session%202/Thaa_Whalesharks_ECOCEAN.csv')

head(raw_data, n = 4)
## # A tibble: 4 × 11
##   DeployID   PTT Latitude Longitude Loc.Class Loc.type Loc.DateTime       
##   <chr>    <dbl>    <dbl>     <dbl> <chr>     <chr>    <dttm>             
## 1 M-130    13596     2.39      73.3 3         Argos    2017-03-28 20:30:00
## 2 M-130    13596     2.40      73.4 2         Argos    2017-03-30 10:10:16
## 3 M-130    13596     2.41      73.4 B         Argos    2017-03-31 06:54:54
## 4 M-130    13596     2.39      73.3 3         Argos    2017-04-04 19:40:00
## # ℹ 4 more variables: Error.Radius <dbl>, `Semi-major.Axis` <dbl>,
## #   `Semi-minor.Axis` <dbl>, Ellipse.Orientation <dbl>
## Now lets format the data using transmute() so aniMotum can read it properly

tagdat <-
  raw_data %>% 
  transmute(id = DeployID,
            date = Loc.DateTime,
            lc = Loc.Class,
            lon = Longitude,
            lat = Latitude,
            smaj = `Semi-major.Axis`,
            smin = `Semi-minor.Axis`,
            eor = Ellipse.Orientation)

head(tagdat, n = 4)
## # A tibble: 4 × 8
##   id    date                lc      lon   lat  smaj  smin   eor
##   <chr> <dttm>              <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 M-130 2017-03-28 20:30:00 3      73.3  2.39   262    64   104
## 2 M-130 2017-03-30 10:10:16 2      73.4  2.40  1653    73    75
## 3 M-130 2017-03-31 06:54:54 B      73.4  2.41  2868  2709    82
## 4 M-130 2017-04-04 19:40:00 3      73.3  2.39   403    53    68

Now that we have formatted the data correctly, lets move onto the second step; choosing and fitting a movement model. This step allows us to model the data we have, and gives us the ability to then predict positions along the animals movement path to produce positions at fixed time periods. This becomes important when we want to calculate metrics of movements to help define movement behaviours. There are three main movement processes that aniMotum provides:

  1. Random Walk (RW) models - where movements between positions are modeled to be random in direction and magnitude.
  2. Correlated Random Walk (CRW) model - where movements between positions are modeled as random and correlated in direction and magnitude.
  3. Continuous-time Move Persistence (MP) model - where movements between positions are modeled as random with correlations in direction and magnitude that vary in time.

You can explore the specifics of this package from the vignette here.

## Lets now now fit a Continuous-time Move Persistence (MP) model to our data

fit <-
  fit_ssm(x = tagdat, 
          vmax = 1.3, model = "mp")




Back to top



Visualising satellite tag data using ggspatial






Back to top



Session 3

Working with passive acoustic telemetry data

In this session we will go through a brief walk through of how we can use the VTrack R package to quickly format and analyse large acoustic tracking datasets. A lot of the functions here do similar analyses to the ones you learned in the previous session. We will then go through a new R package called remora that helps users to interactively explore their data as well as append environmental data to detections to further your analysis of animal movements.

Here we are just arming you with multiple tools to be able to analyse your data. Which analysis (and thus R package) is more appropriate and suitable to your dataset will depend on your study design, research questions and data available. For this session, we will use the same data you worked on in session 2, however we will use the IMOS Workshop_Bull-shark-sample-dataset in the data folder you have downloaded.



Structure of acoustic telemetry data





Back to top



Explore patterns of detection with VTrack





Back to top



Explore your data with remora







Signoff!

This is where we end our R workshop! There may have been a few bits of code that you had trouble with or need more time to work through. We encourage you to discuss these with me as well as others at the workshop to help get a handle on the R code.


If you have any comments or queries reguarding this workshop feel free to contact me:

Happy Tracking!